PROTEIN ACTIVITY SCREENING
OF CLONES HAVING DNA FROM UNCULTIVATED MICROORGANISMS 



This application is a divisional application of U.S. Patent Application Serial No. 
08/657,409, which was filed on June 3, 1996, which was a continuation-in-part of U.S. 
application Serial No. 08/568,994 which was filed on December 7, 1995 (copending) which is a 
continuation-in-part of U.S. application Serial No. 08/503,606 which was filed on July 18, 1995 
(copending). 

This invention relates to the field of preparing and screening libraries of clones 
containing microbially derived DNA. 

Naturally occurring assemblages of microorganisms often encompass a bewildering array
of physiological and metabolic diversity. In fact, it has been estimated that to date less than one
percent of the world's organisms have been cultured. It has been suggested that a large fraction
of this diversity thus far has been unrecognized due to difficulties in enriching and isolating
microorganisms in pure culture. Therefore, it has been difficult or impossible to identify or
isolate valuable proteins, e.g. enzymes, from these samples. These limitations suggest the need
for alternative approaches to characterize the physiological and metabolic potential, i.e. activities
of interest, of as-yet uncultivated microorganisms, which to date have been characterized solely
by analyses of PCR-amplified rRNA gene fragments, clonally recovered from mixed assemblage
nucleic acids.

In one aspect, the invention provides a process of screening clones having DNA from an
uncultivated microorganism for a specified protein, e.g. enzyme, activity which process
comprises:

screening for a specified protein, e.g. enzyme, activity in a library of clones prepared by
(i) recovering DNA from a DNA population derived from at least one uncultivated
microorganism; and

(ii) transforming a host with recovered DNA to produce a library of clones which are
screened for the specified protein, e.g. enzyme, activity.

The library is produced from DNA which is recovered without culturing of an organism, 
particularly where the DNA is recovered from an environmental sample containing 
microorganisms which are not or cannot be cultured. 

In a preferred embodiment DNA is ligated into a vector, particularly wherein the vector
further comprises expression regulatory sequences which can control and regulate the production
of a detectable protein, e.g. enzyme, activity from the ligated DNA.

The f-factor (or fertility factor) in E. coli is a plasmid which effects high frequency
transfer of itself during conjugation and less frequent transfer of the bacterial chromosome itself.
To achieve and stably propagate large DNA fragments from mixed microbial samples, a
particularly preferred embodiment is to use a cloning vector containing an f-factor origin of 
replication to generate genomic libraries that can be replicated with a high degree of fidelity. 
When integrated with DNA from a mixed uncultured environmental sample, this makes it 
possible to achieve large genomic fragments in the form of a stable "environmental DNA 
library." 

In another preferred embodiment, double stranded DNA obtained from the uncultivated
DNA population is selected by: 

converting the double stranded genomic DNA into single stranded DNA; 

recovering from the converted single stranded DNA single stranded DNA which 
specifically binds, such as by hybridization, to a probe DNA sequence; and 

converting recovered single stranded DNA to double stranded DNA. 

The probe may be directly or indirectly bound to a solid phase by which it is separated 
from single stranded DNA which is not hybridized or otherwise specifically bound to the probe. 



The process can also include releasing single stranded DNA from said probe after 
recovering said hybridized or otherwise bound single stranded DNA and amplifying the single 
stranded DNA so released prior to converting it to double stranded DNA. 

The invention also provides a process of screening clones having DNA from an
uncultivated microorganism for a specified protein, e.g. enzyme, activity which comprises
screening for a specified gene cluster protein product activity in the library of clones prepared
by: (i) recovering DNA from a DNA population derived from at least one uncultivated
microorganism; and (ii) transforming a host with recovered DNA to produce a library of clones
which are screened for the specified protein, e.g. enzyme, activity. The library is produced from
gene cluster DNA which is recovered without culturing of an organism, particularly where the 
DNA gene clusters are recovered from an environmental sample containing microorganisms 
which are not or cannot be cultured. 

Alternatively, double-stranded gene cluster DNA obtained from the uncultivated DNA
population is selected by converting the double-stranded genomic gene cluster DNA into
single-stranded DNA; recovering from the converted single-stranded gene cluster polycistron DNA,
single-stranded DNA which specifically binds, such as by hybridization, to a polynucleotide
probe sequence; and converting recovered single-stranded gene cluster DNA to double-stranded
DNA.

These and other aspects of the present invention are described with respect to particular 
preferred embodiments and will be apparent to those skilled in the art from the teachings herein.

The microorganisms from which the libraries may be prepared include prokaryotic 
microorganisms, such as Eubacteria and Archaebacteria, and lower eukaryotic microorganisms
such as fungi, some algae and protozoa. The microorganisms are uncultured microorganisms 
obtained from environmental samples and such microorganisms may be extremophiles, such as 
thermophiles, hyperthermophiles, psychrophiles, psychrotrophs, etc. 



As indicated above, the library is produced from DNA which is recovered without 
culturing of an organism, particularly where the DNA is recovered from an environmental 
sample containing microorganisms which are not or cannot be cultured. Sources of 
microorganism DNA as a starting material library from which DNA is obtained are particularly 
contemplated to include environmental samples, such as microbial samples obtained from Arctic 
and Antarctic ice, water or permafrost sources, materials of volcanic origin, materials from soil 
or plant sources in tropical areas, etc. Thus, for example, genomic DNA may be recovered from
either uncultured or non-culturable organisms and employed to produce an appropriate library of
clones for subsequent determination of protein, e.g. enzyme, activity.

Bacteria and many eukaryotes have a coordinated mechanism for regulating genes whose 
products are involved in related processes. The genes are clustered, in structures referred to as 
"gene clusters," on a single chromosome and are transcribed together under the control of a 
single regulatory sequence, including a single promoter which initiates transcription of the entire 
cluster. The gene cluster, the promoter, and additional sequences that function in regulation 
altogether are referred to as an "operon" and can include up to 20 or more genes, usually from 2 
to 6 genes. Thus, a gene cluster is a group of adjacent genes that are either identical or related, 
usually as to their function.

Some gene families consist of identical members. Clustering is a prerequisite for 
maintaining identity between genes, although clustered genes are not necessarily identical. Gene
clusters range from extremes where a duplication is generated to adjacent related genes to cases
where hundreds of identical genes lie in a tandem array. Sometimes no significance is
discernible in a repetition of a particular gene. A principal example of this is the expressed
duplicate insulin genes in some species, whereas a single insulin gene is adequate in other 
mammalian species. 

It is important to further research gene clusters and the extent to which the full length of
the cluster is necessary for the expression of the proteins resulting therefrom. Further, gene
clusters undergo continual reorganization and, thus, the ability to create heterogeneous libraries
of gene clusters from, for example, bacterial or other prokaryote sources is valuable in
determining sources of novel proteins, particularly including proteins, e.g. enzymes, such as, for
example, the polyketide synthases that are responsible for the synthesis of polyketides having a
vast array of useful activities. Other types of proteins that are the product(s) of gene clusters are
also contemplated, including, for example, antibiotics, antivirals, antitumor agents and
regulatory proteins, such as insulin.

Polyketides are molecules which are an extremely rich source of bioactivities, including
antibiotics (such as tetracyclines and erythromycin), anti-cancer agents (daunomycin),
immunosuppressants (FK506 and rapamycin), and veterinary products (monensin). Many
polyketides (produced by polyketide synthases) are valuable as therapeutic agents. Polyketide
synthases are multifunctional proteins, e.g. enzymes, that catalyze the biosynthesis of a huge
variety of carbon chains differing in length and patterns of functionality and cyclization.
Polyketide synthase genes fall into gene clusters and at least one type (designated type I) of
polyketide synthases have large size genes and proteins, e.g. enzymes, complicating genetic
manipulation and in vitro studies of these genes/proteins.

The ability to select and combine desired components from a library of polyketides and
postpolyketide biosynthesis genes for generation of novel polyketides for study is appealing. The
method(s) of the present invention make possible and facilitate the cloning of novel
polyketide synthases, since one can generate gene banks with clones containing large inserts
(especially when using the f-factor based vectors), which facilitates cloning of gene clusters.

Preferably, the gene cluster DNA is ligated into a vector, particularly wherein a vector
further comprises expression regulatory sequences which can control and regulate the production
of a detectable protein or protein-related array activity from the ligated gene clusters. Use of
vectors which have an exceptionally large capacity for exogenous DNA introduction is
particularly appropriate for such gene clusters; such vectors are described by way of example
herein to include the f-factor (or fertility factor) of E. coli. This f-factor of E. coli is a plasmid
which effects high-frequency transfer of itself during conjugation and is ideal to achieve and stably
propagate large DNA fragments, such as gene clusters from mixed microbial samples.

The term "derived" or "isolated" means that material is removed from its original 
environment (e.g., the natural environment if it is naturally occurring). For example, a naturally- 
occurring polynucleotide or polypeptide present in a living animal is not isolated, but the same 
polynucleotide or polypeptide separated from some or all of the coexisting materials in the 
natural system, is isolated. 

The DNA isolated or derived from these microorganisms can preferably be inserted into a 
vector prior to probing for selected DNA. Such vectors are preferably those containing 
expression regulatory sequences, including promoters, enhancers and the like. Such 
polynucleotides can be part of a vector and/or a composition and still be isolated, in that such 
vector or composition is not part of its natural environment. Particularly preferred phage or 
plasmid and methods for introduction and packaging into them are described in detail in the 
protocol set forth herein. 

The following outlines a general procedure for producing libraries from nonculturable 
organisms, which libraries can be probed to select therefrom DNA sequences which hybridize to 
specified probe DNA: 

Obtain Biomass 

DNA Isolation 

Shear DNA (25 gauge needle) 

Blunt DNA (Mung Bean Nuclease)

Methylate (EcoR I Methylase) 

Ligate to EcoR I linkers (GGAATTCC) 

Cut back linkers (EcoR I Restriction Endonuclease) 

Size Fractionate (Sucrose Gradient) 

Ligate to lambda vector (Lambda ZAP® (Stratagene) and gt11)

Package (in vitro lambda packaging extract)

Plate on E. coli host and amplify



The probe DNA used for selectively recovering DNA of interest from the DNA derived 
from the at least one uncultured microorganism can be a full-length coding region sequence or a 
partial coding region sequence of DNA for a protein, e.g. enzyme, of known activity, a
phylogenetic marker or other identified DNA sequence. The original DNA library can be
preferably probed using mixtures of probes comprising at least a portion of the DNA sequence
encoding the specified activity. These probes or probe libraries are preferably single-stranded
and the microbial DNA which is probed has preferably been converted into single-stranded form.
The probes that are particularly suitable are those derived from DNA encoding proteins, e.g.
enzymes, having an activity similar or identical to the specified protein, e.g. enzyme, activity
which is to be screened. 

The probe DNA should be at least about 10 bases and preferably at least 15 bases. In one 
embodiment, the entire coding region may be employed as a probe. Conditions for the 
hybridization in which DNA is selectively isolated by the use of at least one DNA probe will be 
designed to provide a hybridization stringency of at least about 50% sequence identity, more 
particularly a stringency providing for a sequence identity of at least about 70%. 
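
The probe length and identity thresholds just stated can be illustrated with a short sketch. It is only an illustration under stated assumptions: percent identity over an ungapped, equal-length alignment is used as a crude stand-in for hybridization stringency, and the function names and example sequences are hypothetical rather than part of the described process.

```python
MIN_PROBE_LENGTH = 10        # "at least about 10 bases"
PREFERRED_PROBE_LENGTH = 15  # "preferably at least 15 bases"


def percent_identity(probe: str, target: str) -> float:
    """Percent of matching positions in an ungapped, equal-length alignment."""
    if len(probe) != len(target):
        raise ValueError("sequences must have equal length")
    if not probe:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(probe.upper(), target.upper()) if a == b)
    return 100.0 * matches / len(probe)


def classify_probe_length(probe: str) -> str:
    """Classify a probe against the two length thresholds given in the text."""
    if len(probe) < MIN_PROBE_LENGTH:
        return "too short"
    if len(probe) < PREFERRED_PROBE_LENGTH:
        return "acceptable"
    return "preferred"


def passes_identity(probe: str, target: str, min_identity: float = 50.0) -> bool:
    """True if identity meets the threshold (50% by default, 70% for the stricter case)."""
    return percent_identity(probe, target) >= min_identity


if __name__ == "__main__":
    probe = "ACGTACGTAC"    # hypothetical 10-base probe
    target = "ACGTACGTAA"   # hypothetical target region, one mismatch
    print(classify_probe_length(probe))
    print(percent_identity(probe, target))
    print(passes_identity(probe, target, min_identity=70.0))
```

Real hybridization stringency also depends on temperature, salt concentration and probe composition, none of which this sketch models.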

Hybridization techniques for probing a microbial DNA library to isolate DNA of 
potential interest are well known in the art and any of those which are described in the literature 
are suitable for use herein, particularly those which use a solid phase-bound, directly or indirectly
bound, probe DNA for ease in separation from the remainder of the DNA derived from the
microorganisms. 

Preferably the probe DNA is "labeled" with one partner of a specific binding pair (i.e. a
ligand) and the other partner of the pair is bound to a solid matrix to provide ease of separation
of target from its source. The ligand and specific binding partner can be selected from, in either
orientation, the following: (1) an antigen or hapten and an antibody or specific binding fragment
thereof; (2) biotin or iminobiotin and avidin or streptavidin; (3) a sugar and a lectin specific
therefor; (4) a protein, e.g. enzyme, and an inhibitor therefor; (5) an apoenzyme and cofactor; (6)
complementary homopolymeric oligonucleotides; and (7) a hormone and a receptor therefor. The
solid phase is preferably selected from: (1) a glass or polymeric surface; (2) a packed column of
polymeric beads; and (3) magnetic or paramagnetic particles.

The library of clones prepared as described above can be screened directly for enzymatic
activity without the need for culture expansion, amplification or other supplementary procedures.
However, in one preferred embodiment, it is considered desirable to amplify the DNA recovered
from the individual clones such as by PCR.

Further, it is optional but desirable to perform an amplification of the target DNA that has 
been isolated. In this embodiment the selectively isolated DNA is separated from the probe DNA
after isolation. It is then amplified before being used to transform hosts. The double stranded 
DNA selected to include as at least a portion thereof a predetermined DNA sequence can be 
rendered single stranded, subjected to amplification and reannealed to provide amplified 
numbers of selected double stranded DNA. Numerous amplification methodologies are now well 
known in the art. 

The selected DNA is then used for preparing a library for screening by transforming a 
suitable organism. Hosts, particularly those specifically identified herein as preferred, are 
transformed by artificial introduction of the vectors containing the target DNA by inoculation 
under conditions conducive for such transformation. 

The resultant libraries of transformed clones are then screened for clones which display 
activity for the protein, e.g. enzyme, of interest in a phenotypic assay for protein, e.g. enzyme, 
activity. 

Having prepared a multiplicity of clones from DNA selectively isolated from an
organism, such clones are screened for a specific protein, e.g. enzyme, activity and to identify
the clones having the specified protein, e.g. enzyme, characteristics.

The screening for protein, e.g. enzyme, activity may be effected on individual expression
clones or may be initially effected on a mixture of expression clones to ascertain whether or not
the mixture has one or more specified protein, e.g. enzyme, activities. If the mixture has a
specified protein, e.g. enzyme, activity, then the individual clones may be rescreened for such
protein, e.g. enzyme, activity or for a more specific activity. Thus, for example, if a clone
mixture has hydrolase activity, then the individual clones may be recovered and screened to
determine which of such clones has hydrolase activity.
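
The mixture-first strategy described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: `assay` is a hypothetical callable standing in for any phenotypic activity assay, it is assumed to return true whenever at least one clone in the tested mixture is active, and the pool size is an arbitrary illustration rather than a value taken from the text.

```python
from typing import Callable, List, Sequence, Tuple


def screen_in_pools(
    clones: Sequence[str],
    assay: Callable[[Sequence[str]], bool],
    pool_size: int,
) -> Tuple[List[str], int]:
    """Screen mixtures first, then rescreen individual clones of positive mixtures.

    Returns the positive clones and the total number of assays run.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    positives: List[str] = []
    assays_run = 0
    for start in range(0, len(clones), pool_size):
        pool = clones[start:start + pool_size]
        assays_run += 1                 # one assay on the whole mixture
        if not assay(pool):
            continue                    # negative mixture: skip its clones
        for clone in pool:
            assays_run += 1             # rescreen each clone individually
            if assay([clone]):
                positives.append(clone)
    return positives, assays_run


if __name__ == "__main__":
    library = [f"clone{i}" for i in range(12)]   # hypothetical clone names
    active = {"clone3", "clone10"}               # hypothetical hydrolase-positive clones
    hits, n_assays = screen_in_pools(
        library, lambda mix: any(c in active for c in mix), pool_size=4
    )
    print(hits, n_assays)
```

Pooling pays off only when positives are rare; if most mixtures test positive, the rescreening step removes the saving.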

The DNA derived from a microorganism(s) is preferably inserted into an appropriate 
vector (generally a vector containing suitable regulatory sequences for effecting expression) 
prior to subjecting such DNA to a selection procedure to select and isolate therefrom DNA 
which hybridizes to DNA derived from DNA encoding a protein, e.g. enzyme(s), having the
specified protein, e.g. enzyme, activity. 

As representative examples of expression vectors which may be used there may be 
mentioned viral particles, baculovirus, phage, plasmids, phagemids, cosmids, fosmids, bacterial
artificial chromosomes, viral DNA (e.g. vaccinia, adenovirus, fowl pox virus, pseudorabies and
derivatives of SV40), P1-based artificial chromosomes, yeast plasmids, yeast artificial
chromosomes, and any other vectors specific for specific hosts of interest (such as bacillus,
aspergillus, yeast, etc.). Thus, for example, the DNA may be included in any one of a variety of
expression vectors for expressing a polypeptide. Such vectors include chromosomal,
nonchromosomal and synthetic DNA sequences. Large numbers of suitable vectors are known to
those of skill in the art, and are commercially available. The following vectors are provided by
way of example: Bacterial: pQE70, pQE60, pQE-9 (Qiagen), pBLUESCRIPT SK,
pBLUESCRIPT KS (Stratagene); pTRC99a, pKK223-3, pDR540, pRIT2T (Pharmacia);
Eukaryotic: pWLNEO, pXT1, pSG5 (Stratagene); pSVK3, pBPV, pMSG, pSVLSV40
(Pharmacia). However, any other plasmid or vector may be used as long as it is replicable
and viable in the host. 



A particularly preferred type of vector for use in the present invention contains an f- 
factor origin of replication. The f-factor (or fertility factor) in E. coli is a plasmid which effects 
high frequency transfer of itself during conjugation and less frequent transfer of the bacterial 
chromosome itself. A particularly preferred embodiment is to use cloning vectors, referred to as
"fosmids" or bacterial artificial chromosome (BAC) vectors. These are derived from the E. coli
f-factor and are able to stably integrate large segments of genomic DNA. When integrated with
DNA from a mixed uncultured environmental sample, this makes it possible to achieve large
genomic fragments in the form of a stable "environmental DNA library."

The DNA derived from a microorganism(s) may be inserted into the vector by a variety 
of procedures. In general, the DNA sequence is inserted into an appropriate restriction 
endonuclease site(s) by procedures known in the art. Such procedures and others are deemed to 
be within the scope of those skilled in the art. 

The DNA sequence in the expression vector is operatively linked to an appropriate 
expression control sequence(s) (promoter) to direct mRNA synthesis. Particular named bacterial 
promoters include lacI, lacZ, T3, T7, gpt, lambda PR, PL and trp. Eukaryotic promoters include
CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within the
level of ordinary skill in the art. The expression vector also contains a ribosome binding site for
translation initiation and a transcription terminator. The vector may also include appropriate
sequences for amplifying expression. Promoter regions can be selected from any desired gene
using CAT (chloramphenicol acetyltransferase) vectors or other vectors with selectable markers.

In addition, the expression vectors preferably contain one or more selectable marker 
genes to provide a phenotypic trait for selection of transformed host cells such as dihydrofolate 
reductase or neomycin resistance for eukaryotic cell culture, or such as tetracycline or ampicillin 
resistance in E. coli. 

Generally, recombinant expression vectors will include origins of replication and
selectable markers permitting transformation of the host cell, e.g. the ampicillin resistance gene
of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to
direct transcription of a downstream structural sequence. Such promoters can be derived from
operons encoding glycolytic proteins, e.g. enzymes, such as 3-phosphoglycerate kinase (PGK),
α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural
sequence is assembled in appropriate phase with translation initiation and termination sequences, 
and preferably, a leader sequence capable of directing secretion of translated protein into the 
periplasmic space or extracellular medium. 

The DNA selected and isolated as hereinabove described is introduced into a suitable 
host to prepare a library which is screened for the desired protein, e.g. enzyme, activity. The 
selected DNA is preferably already in a vector which includes appropriate control sequences 
whereby selected DNA which encodes a protein, e.g. enzyme, may be expressed, for
detection of the desired activity. The host cell can be a higher eukaryotic cell, such as a
mammalian cell, or a lower eukaryotic cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a bacterial cell. Introduction of the construct into the host cell can be
effected by transformation, calcium phosphate transfection, DEAE-Dextran mediated
transfection, DMSO or electroporation (Davis, L., Dibner, M., Battey, I., Basic Methods in
Molecular Biology, (1986)).

As representative examples of appropriate hosts, there may be mentioned: bacterial cells, 
such as E. coli, Bacillus, Streptomyces, Salmonella typhimurium; fungal cells, such as yeast;
insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS or
Bowes melanoma; adenoviruses; plant cells, etc. 

The selection of an appropriate host is deemed to be within the scope of those skilled in the art 
from the teachings herein. 

Host cells are genetically engineered (transduced or transformed or transfected) with the
vectors. The engineered host cells can be cultured in conventional nutrient media modified as
appropriate for activating promoters, selecting transformants or amplifying genes. The culture
conditions, such as temperature, pH and the like, are those previously used with the host cell 
selected for expression, and will be apparent to the ordinarily skilled artisan. 

The library may be screened for a specified protein, e.g. enzyme, activity by procedures
known in the art. For example, the protein, e.g. enzyme, activity may be screened for one or more
of the six IUB classes: oxidoreductases, transferases, hydrolases, lyases, isomerases and ligases.
The recombinant proteins, e.g. enzymes, which are determined to be positive for one or more of
the IUB classes may then be rescreened for a more specific protein, e.g. enzyme, activity.

Alternatively, the library may be screened for a more specialized protein, e.g. enzyme,
activity. For example, instead of generically screening for hydrolase activity, the library may be
screened for a more specialized activity, i.e. the type of bond on which the hydrolase acts. Thus,
for example, the library may be screened to ascertain those hydrolases which act on one or more
specified chemical functionalities, such as: (a) amide (peptide bonds), i.e. proteases; (b) ester
bonds, i.e. esterases and lipases; (c) acetals, i.e. glycosidases, etc.
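
The two-level screen described above, first by IUB class and then by the bond on which a hydrolase acts, can be written down as a small lookup. This is a hedged sketch: the dictionary keys, the function name and the example clone names are illustrative assumptions, and only the class names and the three bond-to-enzyme pairings come from the text.

```python
from typing import Dict, Tuple

# The six IUB classes named in the text.
IUB_CLASSES = (
    "oxidoreductase", "transferase", "hydrolase",
    "lyase", "isomerase", "ligase",
)

# Bond acted on by a hydrolase -> more specific enzyme type(s), per the text.
HYDROLASE_BY_BOND: Dict[str, Tuple[str, ...]] = {
    "amide": ("protease",),
    "ester": ("esterase", "lipase"),
    "acetal": ("glycosidase",),
}


def refine_hits(
    class_hits: Dict[str, str], bond_hits: Dict[str, str]
) -> Dict[str, Tuple[str, ...]]:
    """Combine a first-pass class screen with a hydrolase bond-type rescreen.

    class_hits maps clone -> IUB class; bond_hits maps clone -> bond type and is
    consulted only for clones that scored as hydrolases.
    """
    refined: Dict[str, Tuple[str, ...]] = {}
    for clone, enzyme_class in class_hits.items():
        if enzyme_class not in IUB_CLASSES:
            raise ValueError(f"unknown IUB class: {enzyme_class}")
        if enzyme_class == "hydrolase" and clone in bond_hits:
            refined[clone] = HYDROLASE_BY_BOND[bond_hits[clone]]
        else:
            refined[clone] = (enzyme_class,)
    return refined


if __name__ == "__main__":
    first_pass = {"c1": "hydrolase", "c2": "lyase"}   # hypothetical results
    rescreen = {"c1": "ester"}                        # hypothetical rescreen result
    print(refine_hits(first_pass, rescreen))
```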

The clones which are identified as having the specified protein, e.g. enzyme, activity may
then be sequenced to identify the DNA sequence encoding a protein, e.g. enzyme, having the
specified activity. Thus, in accordance with the present invention it is possible to isolate and
identify (i) DNA encoding a protein, e.g. enzyme, having a specified protein, e.g. enzyme,
activity and (ii) proteins, e.g. enzymes, having such activity (including the amino acid sequence
thereof), and (iii) to produce recombinant proteins, e.g. enzymes, having such activity.

The present invention may be employed for example, to identify uncultured 
microorganisms with proteins, e.g. enzymes, having, for example, the following activities which 
may be employed for the following uses: 

Lipase/Esterase

a. Enantioselective hydrolysis of esters (lipids)/thioesters

1) Resolution of racemic mixtures

2) Synthesis of optically active acids or alcohols from meso-diesters

b. Selective syntheses

1) Regiospecific hydrolysis of carbohydrate esters

2) Selective hydrolysis of cyclic secondary alcohols

c. Synthesis of optically active esters, lactones, acids, alcohols

1) Transesterification of activated/nonactivated esters

2) Interesterification

3) Optically active lactones from hydroxyesters

4) Regio- and enantioselective ring opening of anhydrides

d. Detergents

e. Fat/Oil conversion

f. Cheese ripening


Protease

a. Ester/amide synthesis

b. Peptide synthesis

c. Resolution of racemic mixtures of amino acid esters

d. Synthesis of non-natural amino acids

e. Detergents/protein hydrolysis



Glycosidase/Glycosyl transferase 

a. Sugar/polymer synthesis 

b. Cleavage of glycosidic linkages to form mono-, di- and oligosaccharides

c. Synthesis of complex oligosaccharides

d. Glycoside synthesis using UDP-galactosyl transferase

e. Transglycosylation of disaccharides, glycosyl fluorides, aryl galactosides

f. Glycosyl transfer in oligosaccharide synthesis

g. Diastereoselective cleavage of β-glucosylsulfoxides

h. Asymmetric glycosylations 

i. Food processing 
j. Paper processing 

Phosphatase/Kinase 

a. Synthesis/hydrolysis of phosphate esters 

1) Regio-, enantioselective phosphorylation

2) Introduction of phosphate esters 

3) Synthesize phospholipid precursors 

4) Controlled polynucleotide synthesis 

b. Activate biological molecule 

c. Selective phosphate bond formation without protecting groups

Mono/Dioxygenase

a. Direct oxyfunctionalization of unactivated organic substrates 

b. Hydroxylation of alkane, aromatics, steroids 

c. Epoxidation of alkenes 

d. Enantioselective sulphoxidation

e. Regio- and stereoselective Baeyer-Villiger oxidation

Haloperoxidase 

a. Oxidative addition of halide ion to nucleophilic sites

b. Addition of hypohalous acids to olefinic bonds 

c. Ring cleavage of cyclopropanes 

d. Activated aromatic substrates converted to ortho and para derivatives 

e. 1,3-diketones converted to 2-halo-derivatives

f. Heteroatom oxidation of sulfur and nitrogen containing substrates

g. Oxidation of enol acetates, alkynes and activated aromatic rings



Lignin peroxidase/Diarylpropane peroxidase 

a. Oxidative cleavage of C-C bonds 

b. Oxidation of benzylic alcohols to aldehydes

c. Hydroxylation of benzylic carbons

d. Phenol dimerization 

e. Hydroxylation of double bonds to form diols 

f. Cleavage of lignin aldehydes 

Epoxide hydrolase 

a. Synthesis of enantiomerically pure bioactive compounds

b. Regio- and enantioselective hydrolysis of epoxide

c. Aromatic and olefinic epoxidation by monooxygenases to form epoxides

d. Resolution of racemic epoxides 

e. Hydrolysis of steroid epoxides 

Nitrile hydratase/nitrilase

a. Hydrolysis of aliphatic nitriles to carboxamides 

b. Hydrolysis of aromatic, heterocyclic, unsaturated aliphatic nitriles to 
corresponding acids 

c. Hydrolysis of acrylonitrile 

d. Production of aromatic and carboxamides, carboxylic acids 
(nicotinamide, picolinamide, isonicotinamide) 

e. Regioselective hydrolysis of acrylic dinitrile 

f. α-amino acids from α-hydroxynitriles

Transaminase

a. Transfer of amino groups into oxo-acids

Amidase/Acylase

a. Hydrolysis of amides, amidines, and other C-N bonds

b. Non-natural amino acid resolution and synthesis 




Preparation of a Representative DNA Library 



The following outlines the procedures used to generate a gene library from a sample of
the exterior surface of a whale bone found at 1240 meters depth in the Santa Catalina Basin
during a dive expedition.

Isolate DNA

1. ISOQUICK Procedure as per manufacturer's instructions.

Shear DNA

1. Vigorously push and pull DNA through a 25G double-hub needle and 1-cc syringes about 500 times.

2. Check a small amount (0.5 µg) on a 0.8% agarose gel to make sure the majority of the DNA is in the desired size range (about 3-6 kb).

Blunt DNA

1. Add:

H2O to a final volume of 405 µl

45 µl 10X Mung Bean Buffer

2.0 µl Mung Bean Nuclease (150 u/µl)

2. Incubate 37°C, 15 minutes.

3. Phenol/chloroform extract once.

4. Chloroform extract once.

5. Add 1 ml ice cold ethanol to precipitate.

6. Place on ice for 10 minutes.

7. Spin in microfuge, high speed, 30 minutes.

8. Wash with 1 ml 70% ethanol.

9. Spin in microfuge, high speed, 10 minutes and dry.



Methylate DNA 

1. Gently resuspend DNA in 26 µl TE.

2. Add:

4.0 µl 10X EcoR I Methylase Buffer

0.5 µl SAM (32 mM)

5.0 µl EcoR I Methylase (40 u/µl)

3. Incubate 37°C, 1 hour.

Insure Blunt Ends 

1. Add to the methylation reaction:

5.0 µl 100 mM MgCl2

8.0 µl dNTP mix (2.5 mM of each dGTP, dATP, dTTP, dCTP)

4.0 µl Klenow (5 u/µl)

2. Incubate 12°C, 30 minutes.

3. Add 450 µl 1X STE.

4. Phenol/chloroform extract once. 

5. Chloroform extract once. 

6. Add 1 ml ice cold ethanol to precipitate and place on ice for 10 minutes. 

7. Spin in microfuge, high speed, 30 minutes. 

8. Wash with 1 ml 70% ethanol. 

9. Spin in microfuge, high speed, 10 minutes and dry. 

Linker Ligation 

1. Gently resuspend DNA in 7 µl Tris-EDTA (TE).

2. Add:

14 µl Phosphorylated EcoR I linkers (200 ng/µl)

3.0 µl 10X Ligation Buffer

3.0 µl 10 mM rATP

3.0 µl T4 DNA Ligase (4 Wu/µl)

3. Incubate 4°C, overnight.

EcoR I Cutback

1. Heat kill ligation reaction 68°C, 10 minutes.

2. Add:

237.9 µl H2O

30 µl 10X EcoR I Buffer

2.1 µl EcoR I Restriction Enzyme (100 u/µl)

3. Incubate 37°C, 1.5 hours.

4. Add 1.5 µl 0.5 M EDTA.

5. Place on ice.
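
The reaction volumes in the linker ligation and cutback steps can be cross-checked with a little arithmetic. The sketch below is only a bookkeeping aid, not part of the protocol: the component names and helper functions are hypothetical, volumes are transcribed from the steps above in microliters, and the EDTA added after digestion is left out.

```python
from typing import Dict


def total_volume(components: Dict[str, float]) -> float:
    """Sum of component volumes in microliters."""
    return sum(components.values())


def final_buffer_fold(stock_fold: float, buffer_volume: float, reaction_volume: float) -> float:
    """Final buffer concentration (as X) after dilution into the reaction."""
    return stock_fold * buffer_volume / reaction_volume


# Linker ligation: DNA resuspended in TE plus the listed additions.
LINKER_LIGATION = {
    "DNA in TE": 7.0,
    "EcoR I linkers": 14.0,
    "10X ligation buffer": 3.0,
    "10 mM rATP": 3.0,
    "T4 DNA ligase": 3.0,
}

# EcoR I cutback: additions made to the heat-killed ligation reaction.
CUTBACK_ADDITIONS = {
    "H2O": 237.9,
    "10X EcoR I buffer": 30.0,
    "EcoR I": 2.1,
}

if __name__ == "__main__":
    ligation = total_volume(LINKER_LIGATION)
    cutback = ligation + total_volume(CUTBACK_ADDITIONS)
    print(round(ligation, 1), round(final_buffer_fold(10, 3.0, ligation), 2))
    print(round(cutback, 1), round(final_buffer_fold(10, 30.0, cutback), 2))
```

Both buffers work out to about 1X under these assumptions, which is consistent with the volumes as transcribed.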

Sucrose Gradient (2.2 ml) Size Fractionation 

1. Heat sample to 65°C, 10 minutes.

2. Gently load on 2.2 ml sucrose gradient.

3. Spin in mini-ultracentrifuge, 45K, 20°C, 4 hours (no brake).

4. Collect fractions by puncturing the bottom of the gradient tube with a 20G needle
and allowing the sucrose to flow through the needle. Collect the first 20 drops in a
Falcon 2059 tube then collect 10 1-drop fractions (labelled 1-10). Each drop is
about 60 µl in volume.

5. Run 5 µl of each fraction on a 0.8% agarose gel to check the
size.

6. Pool fractions 1-4 (about 10-1.5 kb) and, in a separate tube, pool
fractions 5-7 (about 5-0.5 kb).

7. Add 1 ml ice cold ethanol to precipitate and place on ice for 10 minutes.

8. Spin in microfuge, high speed, 30 minutes.

9. Wash with 1 ml 70% ethanol.

10. Spin in microfuge, high speed, 10 minutes and dry.

11. Resuspend each in 10 µl TE buffer.
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Test Ligation to Lambda Arms 

1. Plate assay to get an approximate concentration. Spot 0.5 µl of the sample on
agarose containing ethidium bromide along with standards (DNA samples of
known concentration). View in UV light and estimate concentration compared to
the standards.

Fraction 1-4 => 1.0 ng/µl
Fraction 5-7 = 500 ng/µl

2. Prepare the following ligation reactions (5 µl reactions) and
incubate 4°C, overnight:

Sample         H2O      10X Ligase   10 mM     Lambda arms      Insert    T4 DNA Ligase
                        Buffer       rATP      (gt11 and ZAP)   DNA       (4 Wu/µl)

Fraction 1-4   0.5 µl   0.5 µl       0.5 µl    1.0 µl           2.0 µl    0.5 µl
Fraction 5-7   0.5 µl   0.5 µl       0.5 µl    1.0 µl           2.0 µl    0.5 µl



Test Package and Plate 

1. Package the ligation reactions following manufacturer's protocol. Package 2.5 µl per
packaging extract (2 extracts per ligation).

2. Stop packaging reactions with 500 µl SM buffer and pool packaging that came from the
same ligation.

3. Titer 1.0 µl of each on appropriate host (OD600 = 1.0) [XL1-Blue MRF' for ZAP®
(Stratagene) and Y1088 for gt11]

Add 200 µl host (in mM MgSO4) to Falcon 2059 tubes
Inoculate with 1 µl packaged phage
Incubate 37°C, 15 minutes
Add about 3 ml 48°C top agar
[50 ml stock containing 150 µl IPTG (0.5M) and 300 µl X-GAL (350 mg/ml)]
Plate on 100 mm plates and incubate 37°C, overnight.

4. Efficiency results:

gt11: 1.7 x 10^[illegible] recombinants with 95% background
ZAP® (Stratagene): 4.2 x 10^[illegible] recombinants with 66% background

Contaminants in the DNA sample may have inhibited the enzymatic reactions, though the 
sucrose gradient and organic extractions may have removed them. Since the DNA sample 
was precious, an effort was made to "fix" the ends for cloning: 

Re-Blunt DNA 

1. Pool all left over DNA that was not ligated to the lambda arms (Fractions 1-7) and
add H2O to a final volume of 12 µl. Then add:

143 µl H2O

20 µl 10X Buffer 2 (from Stratagene's cDNA Synthesis Kit)
23 µl Blunting dNTP (from Stratagene's cDNA Synthesis Kit)
2.0 µl Pfu (from Stratagene's cDNA Synthesis Kit)

2. Incubate 72°C, 30 minutes.

3. Phenol/chloroform extract once.

4. Chloroform extract once.

5. Add 20 µl 3M NaOAc and 400 µl ice cold ethanol to precipitate.

6. Place at -20°C, overnight.

7. Spin in microfuge, high speed, 30 minutes.

8. Wash with 1 ml 70% ethanol.

9. Spin in microfuge, high speed, 10 minutes and dry.

(Do NOT Methylate DNA since it was already methylated in the first round of processing)

Adaptor Ligation

1. Gently resuspend DNA in 8 µl EcoR I adaptors (from Stratagene's cDNA
Synthesis Kit).

2. Add:

1.0 µl 10X Ligation Buffer

1.0 µl 10 mM rATP

1.0 µl T4 DNA Ligase (4 Wu/µl)

3. Incubate 4°C, 2 days.

(Do NOT cutback since using ADAPTORS this time. Instead, need to phosphorylate)

Phosphorylate Adaptors

1. Heat kill ligation reaction 70°C, 30 minutes.
Add:

1.0 µl 10X Ligation Buffer
2.0 µl 10 mM rATP
6.0 µl H2O

1.0 µl PNK (from Stratagene's cDNA Synthesis Kit).

2. Incubate 37°C, 30 minutes.

3. Add 31 µl H2O and 5 µl 10X STE.

4. Size fractionate on a Sephacryl S-500 spin column (pool fractions 1-3).

5. Phenol/chloroform extract once. 

6. Chloroform extract once. 

7. Add ice cold ethanol to precipitate. 

8. Place on ice, 10 minutes. 

9. Spin in microfuge, high speed, 30 minutes.

10. Wash with 1 ml 70% ethanol.

11. Spin in microfuge, high speed, 10 minutes and dry.

12. Resuspend in 10.5 µl TE buffer.

Do not plate assay. Instead, ligate directly to arms as above except use 2.5 µl of DNA and
no water.



Package and titer as above. 

Efficiency results:

gt11: 2.5 x 10^[illegible] recombinants with 2.5% background

ZAP® (Stratagene): 9.6 x 10^[illegible] recombinants with 0% background

Amplification of Libraries (5.0 x 10^[illegible] recombinants from each library)

1. Add 3.0 ml host cells (OD600 = 1.0) to two 50 ml conical tubes.

2. Inoculate with 2.5 x 10^[illegible] pfu per conical tube.

3. Incubate 37°C, 20 minutes.

4. Add top agar to each tube to a final volume of 45 ml.

5. Plate each tube across five 150 mm plates.

6. Incubate 6-8 hours or until plaques are about pin-head in size.

7. Overlay with 8-10 ml SM Buffer and place at 4°C overnight (with gentle rocking
if possible). 

Harvest Phage 

1. Recover phage suspension by pouring the SM buffer off each plate into a 50-ml
conical tube.

2. Add 3 ml chloroform, shake vigorously and incubate at room
temperature, 15 minutes.

3. Centrifuge at 2K rpm, 10 minutes to remove cell debris.

4. Pour supernatant into a sterile flask, add 500 µl chloroform.

5. Store at 4°C. 

Titer Amplified Library 

1. Make serial dilutions:

10^-3 = 1 µl amplified phage in 1 ml SM Buffer
10^-6 = 1 µl of the 10^-3 dilution in 1 ml SM Buffer

2. Add 200 µl host (in 10 mM MgSO4) to two tubes.

3. Inoculate one with 10 µl 10^-6 dilution (10^-5).

4. Inoculate the other with 1 µl 10^-6 dilution (10^-6).

5. Incubate 37°C, 15 minutes.

6. Add about 3 ml 48°C top agar.

[50 ml stock containing 150 µl IPTG (0.5M) and 375 µl
X-GAL (350 mg/ml)]

7. Plate on 100 mm plates and incubate 37°C, overnight.

8. Results:

gt11: [value illegible]/ml
ZAP® (Stratagene): 2.0 x 10^[illegible]/ml
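
The titer of the amplified stock follows from the plaque count, the dilution plated and the volume plated. The sketch below is illustrative only: the function name and the plaque count of 150 are hypothetical, and each "1 µl into 1 ml" step is treated as a 1:1000 dilution (strictly it is 1:1001).

```python
def titer_pfu_per_ml(plaques, dilution, volume_plated_ul):
    """Titer of the undiluted stock in pfu/ml.

    plaques          -- plaques counted on the plate
    dilution         -- dilution of the phage that was plated (e.g. 1e-6)
    volume_plated_ul -- volume of that dilution added to the host, in microliters
    """
    if plaques < 0 or dilution <= 0 or volume_plated_ul <= 0:
        raise ValueError("counts must be non-negative; dilution and volume positive")
    return plaques / (dilution * volume_plated_ul / 1000.0)


# Two serial steps of 1 ul into 1 ml, each approximated as 1:1000.
step = 1e-3
second_dilution = step * step

# Hypothetical plate: 150 plaques from 1 ul of the second dilution.
example = titer_pfu_per_ml(plaques=150, dilution=second_dilution, volume_plated_ul=1.0)
print(f"{example:.2e} pfu/ml")
```

Plating two different amounts of the same dilution, as in steps 3 and 4, gives two independent estimates that should agree within counting error.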

Example 2 
Enzymatic Activity Assay 

The following is a representative example of a procedure for screening an expression 
library prepared in accordance with Example 1. In the following, the chemical characteristic 
Tiers are as follows: 
Tier 1: Hydrolase

Tier 2: Amide, Ester and Acetal

Tier 3: Divisions and subdivisions are based upon the differences between individual
substrates which are covalently attached to the functionality of Tier 2 undergoing reaction, as
well as substrate specificity.

Tier 4: The two possible enantiomeric products which the protein, e.g. enzyme, may
produce from a substrate.






Although the following example is specifically directed to the above mentioned tiers, the general
procedures for testing for various chemical characteristics are applicable to substrates
other than those specifically referred to in this Example.



Screening for Tier 1-hydrolase; Tier 2-amide. Plates of the library prepared as described in
Example 1 are used to multiply inoculate a single plate containing 200 µl of LB Amp/Meth,
glycerol in each well. This step is performed using the High Density Replicating Tool (HDRT)
of the Beckman Biomek with a 1% bleach, water, isopropanol, air-dry sterilization cycle
between each inoculation. The single plate is grown for 2 h at 37°C and is then used to inoculate
two white 96-well Dynatech microtiter daughter plates containing 250 µl of LB Amp/Meth,
glycerol in each well. The original single plate is incubated at 37°C for 18 h, then stored at -80°C.
The two condensed daughter plates are also incubated at 37°C for 18 h. The condensed daughter
plates are then heated at 70°C for 45 min. to kill the cells and inactivate the host E. coli proteins,
e.g. enzymes. A stock solution of 5 mg/mL morphourea phenylalanyl-7-amino-4-trifluoromethyl
coumarin (MuPheAFC, the 'substrate') in DMSO is diluted to 600 µM with 50 mM pH 7.5
Hepes buffer containing 0.6 mg/mL of the detergent dodecyl maltoside.



Fifty µL of the 600 µM MuPheAFC solution is added to each of the wells of the white
condensed plates with one 100 µL mix cycle using the Biomek to yield a final concentration of
substrate of ~100 µM. The fluorescence values are recorded (excitation = 400 nm, emission =
505 nm) on a plate reading fluorometer immediately after addition of the substrate (t=0). The
plate is incubated at 70°C for 100 min, then allowed to cool to ambient temperature for 15
additional minutes. The fluorescence values are recorded again (t=100). The values at t=0 are
subtracted from the values at t=100 to determine if an active clone is present.
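
The dilution and the background subtraction described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the readings and the hit threshold are hypothetical, since the text does not state what size of increase counts as an active well.

```python
def final_concentration_um(stock_um, added_ul, well_ul):
    """Substrate concentration after adding added_ul of stock to well_ul of culture."""
    return stock_um * added_ul / (added_ul + well_ul)


def delta_fluorescence(t0_values, t100_values):
    """Per-well increase in fluorescence: value at t=100 minus value at t=0."""
    if len(t0_values) != len(t100_values):
        raise ValueError("both reads must cover the same wells")
    return [after - before for before, after in zip(t0_values, t100_values)]


def candidate_wells(deltas, threshold):
    """Indices of wells whose increase exceeds a chosen (hypothetical) threshold."""
    return [index for index, delta in enumerate(deltas) if delta > threshold]


# 50 uL of 600 uM substrate into a 250 uL well.
print(final_concentration_um(600.0, 50.0, 250.0))

# Hypothetical readings for four wells (arbitrary fluorescence units).
t0 = [120.0, 115.0, 130.0, 118.0]
t100 = [125.0, 640.0, 133.0, 121.0]
deltas = delta_fluorescence(t0, t100)
print(candidate_wells(deltas, threshold=100.0))
```

With these volumes the final substrate concentration works out to 100 µM, matching the approximate figure given in the text.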




MuPheAFC 

The data will indicate whether one of the clones in a particular well is hydrolyzing the
substrate. In order to determine the individual clone which carries the activity, the source library
plates are thawed and the individual clones are used to singly inoculate a new plate containing
LB Amp/Meth, glycerol. As above, the plate is incubated at 37°C to grow the cells, heated at
70°C to inactivate the host proteins, e.g. enzymes, and 50 µL of 600 µM MuPheAFC is added
using the Biomek. Additionally three other substrates are tested. They are methyl umbelliferone
heptanoate, the CBZ-arginine rhodamine derivative, and fluorescein-conjugated casein (~3.2
mol fluorescein per mol of casein).




methyl umbelliferone heptanoate          (CBZ-arginine) rhodamine 110



The umbelliferone and rhodamine are added as 600 µM stock solutions in 50 µL of Hepes
buffer. The fluorescein conjugated casein is also added in 50 µL at a stock concentration of 20
and 200 mg/mL. After addition of the substrates the t=0 fluorescence values are recorded, the
plate is incubated at 70°C, and the t=100 min. values are recorded as above.

These data indicate which plate the active clone is in. Where the arginine rhodamine
derivative is also turned over by this activity, but the lipase substrate, methyl umbelliferone
heptanoate, and the protein substrate, fluorescein-conjugated casein, do not function as
substrates, the Tier 1 classification is 'hydrolase' and the Tier 2 classification is amide bond.
No cross reactivity should be seen with the Tier 2-ester classification.
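
The decision rule in the preceding paragraph can be written out as a small function. This sketch encodes only the one case the text spells out (hydrolase, amide); the function name, the dictionary keys and the choice to return None for every other pattern are assumptions made for illustration.

```python
def classify_tiers(turnover):
    """Apply the Tier 1 / Tier 2 rule stated in the text.

    turnover maps a substrate name to True when that substrate was turned over
    (fluorescence increased between t=0 and t=100). Substrates missing from the
    mapping are treated as not turned over.
    """
    amide = (turnover.get("MuPheAFC", False)
             or turnover.get("CBZ-arginine rhodamine", False))
    ester = turnover.get("methyl umbelliferone heptanoate", False)
    protein = turnover.get("fluorescein-conjugated casein", False)

    # Amide substrates turned over, lipase (ester) and protein substrates not.
    if amide and not ester and not protein:
        return ("hydrolase", "amide")
    # Any other pattern is outside the rule described here.
    return None


observed = {
    "MuPheAFC": True,
    "CBZ-arginine rhodamine": True,
    "methyl umbelliferone heptanoate": False,
    "fluorescein-conjugated casein": False,
}
print(classify_tiers(observed))
```

A clone that also turned over the heptanoate ester would fall outside this rule and would need the ester branch of the classification, which is not modelled here.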

As shown in Figure 1, a recombinant clone from the library which has been characterized
in Tier 1 as hydrolase and in Tier 2 as amide may then be tested in Tier 3 for various
specificities. In Figure 1, the various classes of Tier 3 are followed by a parenthetical code
which identifies the substrates of Table 1 which are used in identifying such specificities of Tier
3.

As shown in Figures 2 and 3, a recombinant clone from the library which has been
characterized in Tier 1 as hydrolase and in Tier 2 as ester may then be tested in Tier 3 for various
specificities. In Figures 2 and 3, the various classes of Tier 3 are followed by a parenthetical
code which identifies the substrates of Tables 3 and 4 which are used in identifying such
specificities of Tier 3. In Figures 2 and 3, R2 represents the alcohol portion of the ester and R'
represents the acid portion of the ester.

As shown in Figure 4, a recombinant clone from the library which has been characterized
in Tier 1 as hydrolase and in Tier 2 as acetal may then be tested in Tier 3 for various
specificities. In Figure 4, the various classes of Tier 3 are followed by a parenthetical code which
identifies the substrates of Table 5 which are used in identifying such specificities of Tier 3.

Proteins, e.g. enzymes, may be classified in Tier 4 for the chirality of the product(s)
produced by the enzyme. For example, chiral amino esters may be determined using at least the
following substrates:



For each substrate which is turned over, the enantioselectivity value, E, is determined according
to the equation below:

\[ E = \frac{\ln[1 - c(1 + ee_p)]}{\ln[1 - c(1 - ee_p)]} \]

where ee_p = the enantiomeric excess (ee) of the hydrolyzed product and c = the percent
conversion of the reaction. See Wong and Whitesides, Enzymes in Synthetic
Organic Chemistry, 1994, Elsevier, Tarrytown, New York, pp. 9-12.
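
As a worked illustration of the E equation, the sketch below evaluates it for a pair of hypothetical measurements. It assumes that both the conversion c and the product enantiomeric excess are entered as fractions between 0 and 1 (so 40% conversion is 0.4); the function name and the example values are not from the source.

```python
import math


def enantioselectivity(c, ee_p):
    """E = ln[1 - c(1 + ee_p)] / ln[1 - c(1 - ee_p)].

    c    -- conversion as a fraction (0 < c < 1)
    ee_p -- enantiomeric excess of the product as a fraction (0 <= ee_p < 1)
    """
    if not 0.0 < c < 1.0:
        raise ValueError("conversion must be a fraction strictly between 0 and 1")
    if not 0.0 <= ee_p < 1.0:
        raise ValueError("ee_p must be a fraction in [0, 1)")
    numerator_arg = 1.0 - c * (1.0 + ee_p)
    if numerator_arg <= 0.0:
        raise ValueError("c(1 + ee_p) must be below 1 for the logarithm to exist")
    return math.log(numerator_arg) / math.log(1.0 - c * (1.0 - ee_p))


# Hypothetical measurement: 40% conversion, 90% ee of the product.
print(round(enantioselectivity(0.4, 0.9), 1))

# A non-selective reaction (ee_p = 0) gives E = 1 at any conversion.
print(enantioselectivity(0.3, 0.0))
```

Because both logarithms are taken of numbers below 1, numerator and denominator are both negative and E is positive; larger E means a more selective enzyme.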

The enantiomeric excess is determined by either chiral high performance liquid chromatography
(HPLC) or chiral capillary electrophoresis (CE). Assays are performed as follows: two hundred
µL of the appropriate buffer is added to each well of a 96-well white microtiter plate, followed
by 50 µL of partially or completely purified protein, e.g. enzyme, solution; 50 µL of substrate is
added and the increase in fluorescence monitored versus time until 50% of the substrate is
consumed or the reaction stops, whichever comes first.

Example 3 

Construction of a Stable, Large Insert Picoplankton Genomic DNA Library

Figure 5 shows an overview of the procedures used to construct an environmental library
from a mixed picoplankton sample. A stable, large insert DNA library representing picoplankton
genomic DNA was prepared as follows.

Cell collection and preparation of DNA. Agarose plugs containing concentrated
picoplankton cells were prepared from samples collected on an oceanographic cruise from
Newport, Oregon to Honolulu, Hawaii. Seawater (30 liters) was collected in Niskin bottles,
screened through 10 µm Nitex, and concentrated by hollow fiber filtration (Amicon DC10)
through 30,000 MW cutoff polysulfone filters. The concentrated bacterioplankton cells were
collected on a 0.22 µm, 47 mm Durapore filter, and resuspended in 1 ml of 2X STE buffer (1M
NaCl, 0.1M EDTA, 10 mM Tris, pH 8.0) to a final density of approximately 1 x 10^10 cells per
ml. The cell suspension was mixed with one volume of 1% molten Seaplaque LMP agarose
(FMC) cooled to 40°C, and then immediately drawn into a 1 ml syringe. The syringe was sealed
with parafilm and placed on ice for 10 min. The cell-containing agarose plug was extruded into
10 ml of Lysis Buffer (10 mM Tris pH 8.0, 50 mM NaCl, 0.1M EDTA, 1% Sarkosyl, 0.2%
sodium deoxycholate, 1 mg/ml lysozyme) and incubated at 37°C for one hour. The agarose plug
was then transferred to 40 mls of ESP Buffer (1% Sarkosyl, 1 mg/ml proteinase K, in 0.5M
EDTA), and incubated at 55°C for 16 hours. The solution was decanted and replaced with fresh
ESP Buffer, and incubated at 55°C for an additional hour. The agarose plugs were then placed in
50 mM EDTA and stored at 4°C shipboard for the duration of the oceanographic cruise.

One slice of an agarose plug (72 µl) prepared from a sample collected off the Oregon
coast was dialyzed overnight at 4°C against 1 mL of buffer A (100 mM NaCl, 10 mM Bis Tris
Propane-HCl, 100 µg/ml acetylated BSA; pH 7.0 @ 25°C) in a 2 mL microcentrifuge tube. The
solution was replaced with 250 µl of fresh buffer A containing 10 mM MgCl2 and 1 mM DTT
and incubated on a rocking platform for 1 hr at room temperature. The solution was then
changed to 250 µl of the same buffer containing 4U of Sau3A1 (NEB), equilibrated to 37°C in a
water bath, and then incubated on a rocking platform in a 37°C incubator for 45 min. The plug
was transferred to a 1.5 ml microcentrifuge tube and incubated at 68°C for 30 min to inactivate
the protein, e.g. enzyme, and to melt the agarose. The agarose was digested and the DNA
dephosphorylated using Gelase and HK-phosphatase (Epicentre), respectively, according to the
manufacturer's recommendations. Protein was removed by gentle phenol/chloroform extraction
and the DNA was ethanol precipitated, pelleted, and then washed with 70% ethanol. This
partially digested DNA was resuspended in sterile H2O to a concentration of 2.5 ng/µl for
ligation to the pFOS1 vector.

PCR amplification results from several of the agarose plugs (data not shown) indicated
the presence of significant amounts of archaeal DNA. Quantitative hybridization experiments
using rRNA extracted from one sample, collected at 200 m of depth off the Oregon Coast,
indicated that planktonic archaea in this assemblage comprised approximately 4.7% of the total
picoplankton biomass (this sample corresponds to "PAC1"-200 m in Table 1 of DeLong et al.,
High abundance of Archaea in Antarctic marine picoplankton, Nature, 371:695-698, 1994).
Results from archaeal-biased rDNA PCR amplification performed on agarose plug lysates
confirmed the presence of relatively large amounts of archaeal DNA in this sample. Agarose
plugs prepared from this picoplankton sample were chosen for subsequent fosmid library
preparation. Each 1 ml agarose plug from this site contained approximately 7.5 x 10^[illegible] cells,
therefore approximately 5.4 x 10^[illegible] cells were present in the 72 µl slice used in the preparation of
the partially digested DNA.
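
The step from cells per plug to cells per slice is a simple volume scaling. The sketch below applies it to the leading figure only, because the exponents are not legible in this copy; the function name is illustrative, and a uniform distribution of cells through the plug is assumed.

```python
def cells_in_slice(cells_per_plug, slice_ul, plug_ul=1000.0):
    """Cells expected in a slice, assuming cells are evenly spread through the plug."""
    if slice_ul <= 0 or plug_ul <= 0 or slice_ul > plug_ul:
        raise ValueError("slice volume must be positive and no larger than the plug")
    return cells_per_plug * slice_ul / plug_ul


# A 72 ul slice is 7.2% of a 1 ml plug, so a count of 7.5 (times some power of
# ten) per plug becomes 0.54 (times the same power of ten) per slice.
print(cells_in_slice(7.5, 72.0))
```

The result, 0.54, is consistent with the "5.4" quoted for the slice at one order of magnitude below the per-plug figure.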

Vector arms were prepared from pFOS1 as described (Kim et al., Stable propagation of
cosmid sized human DNA inserts in an F factor based vector, Nucl. Acids Res., 20: 10832-10835,
1992). Briefly, the plasmid was completely digested with AatII, dephosphorylated with HK
phosphatase, and then digested with BamHI to generate two arms, each of which contained a cos
site in the proper orientation for cloning and packaging ligated DNA between 35-45 kbp. The
partially digested picoplankton DNA, isolated by pulsed field gel electrophoresis (PFGE),
was ligated overnight to the pFOS1 arms in a 15 µl ligation reaction containing 25 ng each of
vector and insert and 1U of T4 DNA ligase (Boehringer-Mannheim). The ligated DNA in four
microliters of this reaction was in vitro packaged using the Gigapack XL packaging system
(Stratagene), the fosmid particles transfected to E. coli strain DH10B (BRL), and the cells spread
onto LBcm15 plates. The resultant fosmid clones were picked into 96-well microtiter dishes
containing LBcm15 supplemented with 7% glycerol. Recombinant fosmids, each containing ca. 40
kb of picoplankton DNA insert, yielded a library of 3,552 fosmid clones, containing
approximately 1.4 x 10^8 base pairs of cloned DNA. All of the clones examined contained inserts
ranging from 38 to 42 kbp. This library was stored frozen at -80°C for later analysis.
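
The total amount of cloned DNA follows directly from the clone count and the insert size. A minimal sketch, using only the figures in the paragraph above (3,552 clones, inserts of about 40 kb, observed range 38 to 42 kbp); the function name is illustrative.

```python
def cloned_base_pairs(clone_count, insert_bp):
    """Total cloned DNA, assuming every clone carries an insert of insert_bp."""
    return clone_count * insert_bp


CLONES = 3552

typical = cloned_base_pairs(CLONES, 40_000)
low = cloned_base_pairs(CLONES, 38_000)
high = cloned_base_pairs(CLONES, 42_000)

print(f"typical: {typical:.2e} bp")
print(f"range:   {low:.2e} to {high:.2e} bp")
```

All three values round to roughly 1.4 x 10^8 base pairs, in line with the library size quoted in the text.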

Numerous modifications and variations of the present invention are possible in light of 
the above teachings; therefore, within the scope of the claims, the invention may be practiced 
other than as particularly described. 

Table 1

A2

Fluorescein conjugated casein (3.2 mol fluorescein/mol casein)
CBZ-Ala-AMC
t-BOC-Ala-Ala-Asp-AMC
succinyl-Ala-Gly-Leu-AMC
CBZ-Arg-AMC
CBZ-Met-AMC
morphourea-Phe-AMC

(t-BOC = t-butoxy carbonyl, CBZ = carbonyl benzyloxy.
AMC = 7-amino-4-methyl coumarin

AA3 AB3 AC3 AD3

Fluorescein conjugated casein
t-BOC-Ala-Ala-Asp-AFC
CBZ-Ala-Ala-Lys-AFC
succinyl-Ala-Ala-Phe-AFC
succinyl-Ala-Gly-Leu-AFC

AFC = 7-amino-4-trifluoromethyl coumarin.)

AH3

Fluorescein conjugated
fuccinyl-Ala-AUIticAFC

AF3

AG3

CBZ-Ala-Ala-Lys-AFC
CBZ-Arg-AFC

CBZ-Hk-AFC
CBZ-Trp-AFC

t-BOC-Ala-Ala-Asp-AFC
CBZ-Asp-AFC

succinyl-Ala-Gly-Leu-AFC
CBZ-Ala-AFC
CBZ-Ser-AFC

Table 3

[Chemical structures not recoverable from the source text. Legible labels: "And all of L2", LN3, LO3.]

Table 4

4-methyl umbelliferone
wherein R =

G2    β-D-galactose
      β-D-glucose
      β-D-glucuronide
GB3   β-D-cellotrioside
      β-D-cellobiopyranoside
GC3   β-D-galactose
      α-D-galactose
GD3   β-D-glucose
      α-D-glucose
GE3   β-D-glucuronide
GI3   β-D-N,N-diacetylchitobiose
GJ3   β-D-fucose
      α-L-fucose
      β-L-fucose
GK3   β-D-mannose
      α-D-mannose

non-Umbelliferyl substrates

GA3   amylose [polyglucan α1,4 linkages], amylopectin
      [polyglucan branching α1,6 linkages]
GF3   xylan [poly 1,4-D-xylan]
GG3   amylopectin, pullulan
GH3   sucrose, fructofuranoside

WHAT IS CLAIMED IS:

1. A combinatorial gene expression library, comprising a pool of expression constructs,
each expression construct containing one or more cDNA or genomic DNA fragments, wherein
the cDNA or genomic DNA fragments in the pool of expression constructs are derived from a
plurality of species of donor organisms, and wherein the cDNA or genomic DNA fragments in
each expression construct are operably-associated each with one or more regulatory regions that
drives expression of genes encoded by the cDNA or genomic DNA fragments in an appropriate
host organism.

2. A method for making a combinatorial gene expression library, comprising ligating a
DNA vector to one or more cDNA or genomic DNA fragments to generate a library of
expression constructs, wherein the cDNA or genomic DNA fragments in the library of
expression constructs are obtained from a plurality of species of donor organisms, and wherein
genes contained in the cDNA or genomic DNA fragments are operably-associated with their
native or exogenous regulatory regions which drive expression of the genes in an appropriate
host cell.

3. A method for screening a gene expression library or a compound of interest, comprising:

(a) culturing the gene expression library of claim 1; and

(b) detecting a signal generated by the reporter regimen; thereby identifying a clone
which produces the compound.

ABSTRACT

Disclosed is a process of screening clones having DNA from an uncultivated
microorganism for a specified protein, e.g. enzyme, activity by screening for a specified protein,
e.g. enzyme, activity in a library of clones prepared by (i) recovering DNA from a DNA
population derived from at least one uncultivated microorganism; and (ii) transforming a host
with recovered DNA to produce a library of clones which is screened for the specified protein,
e.g. enzyme, activity.

Tier 1

Tier 2

Figure 5

Environmental Library Construction in pFOS1

1. Concentrate bacteria, digest protein and preserve high MW DNA

Agarose "noodle"

Proteinase K, detergent

2. Partially digest DNA and select 40 kbp fragments by
PFGE or by λ-packaging (step 3)

3. Ligate to fosmid arms, package and transfect to E. coli.
Array library in microtiter plates.